Preparing Data For The Data Lake

Authors

  • Alekh Jindal
  • Samuel Madden
Abstract

Data preparation is increasingly becoming one of the biggest challenges in processing big data. While recent tools such as Tamer and Trifacta address the problem of integrating and cleaning the datasets as they come in, preparing these datasets for efficient processing over a variety of query workloads is still challenging. In this talk, I will discuss these challenges and describe our tool which allows for fine-grained data preparation, via a data preparation plan, and efficiently runs this plan while uploading the data to HDFS.

Publication date: 2015